UralicNLP: An NLP Library for Uralic Languages
نویسندگان
چکیده
منابع مشابه
Morphological Tools for Six Small Uralic Languages
This article presents a set of morphological tools for six small endangered minority languages belonging to the Uralic language family, Udmurt, Komi, Eastern Mari, Northern Mansi, Tundra Nenets and Nganasan. Following an introduction to the languages, the two sets of tools used in the project (MorphoLogic’s Humor tools and the Xerox Finite State Tool) are described and compared. The article is ...
متن کاملComputational Morphologies for Small Uralic Languages
This article presents a set of morphological tools for small Uralic languages. Various Hungarian research groups specialized in Finno-Ugric linguistics and a Hungarian language technology company (MorphoLogic) have initiated a project with the goal of producing annotated electronic corpora for small Uralic languages. The languages described include Mordvin, Udmurt (Votyak), Komi (Zyryan), Mansi...
متن کاملLanguages under the influence: Building a database of Uralic languages
For most of the Uralic languages, there is a lack of systematically collected, consequently transcribed and morphologically annotated text corpora. This paper sums up the steps, the preliminary results and the future directions of building a linguistic corpus of some Uralic languages, namely Tundra Nenets, Udmurt, Synya Khanty, and Surgut Khanty. The experiences of building a corpus containing ...
متن کاملDiffSharp: An AD Library for .NET Languages
DiffSharp is an algorithmic differentiation (AD) library for the .NET ecosystem, which is targeted by the C# and F# languages, among others. The library has been designed with machine learning applications in mind [1], allowing very succinct implementations of models and optimization routines. DiffSharp is implemented in F# and exposes forward and reverse AD operators as general nestable higher...
متن کاملA Finite-State Library for NLP
A library of functions is described which use finite-state automata for compact storage and efficient usage of very large dictionaries and language models. The library can be used to test whether a word is in a dictionary, to perform morphological analysis, to construct perfect hash tables, and to construct and use very large language models (such as models which employ bigram and trigram frequ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Journal of Open Source Software
سال: 2019
ISSN: 2475-9066
DOI: 10.21105/joss.01345